National Repository of Grey Literature
Analysis and visualization of the GPT-2 language model
Šipoš, Daniel ; Mareček, David (advisor) ; Rosa, Rudolf (referee)
Visualizing deep neural network models with the Transformer architecture is generally a very demanding task, usually addressed by visualizing attention blocks and monitoring which words these blocks focus on. However, Transformer models have many layers, and each layer has multiple attention heads, so each head may attend to different linguistic features. In this work, we focus on developing an application designed to visualize the behaviour of GPT-2 language models more clearly. We propose four visualization methods that examine the dependencies of generated words on previous words in the text. We monitor these dependencies by removing one of the words in the previously generated text, or replacing it with a similar word, and then observing changes in the probability of the generated word. We show the results of our methods on the GPT-2 Medium model and formulate hypotheses that aim to explain them.
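The word-removal idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the names `removal_effects`, `ProbFn`, and `toy_prob` are hypothetical, and the scorer is a toy stand-in rather than GPT-2. In practice, `prob_fn` would presumably wrap a language model that returns the probability of the target word given the context.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: any function returning P(target | context).
# In the thesis this role is played by GPT-2; here it is left abstract.
ProbFn = Callable[[Sequence[str], str], float]


def removal_effects(context: Sequence[str], target: str, prob_fn: ProbFn) -> List[float]:
    """For each context word, return the change in P(target) when that word is removed."""
    baseline = prob_fn(context, target)
    effects = []
    for i in range(len(context)):
        # Drop the i-th word and re-score the target on the shortened context.
        ablated = list(context[:i]) + list(context[i + 1:])
        effects.append(prob_fn(ablated, target) - baseline)
    return effects


def toy_prob(context: Sequence[str], target: str) -> float:
    """Stand-in scorer (NOT a language model): the score rises with each 'rain' in the context."""
    return min(1.0, 0.1 + 0.4 * sum(1 for w in context if w == "rain"))


if __name__ == "__main__":
    words = ["heavy", "rain", "made", "the", "streets"]
    for word, delta in zip(words, removal_effects(words, "wet", toy_prob)):
        print(f"{word:>8}: {delta:+.2f}")
```

A large negative value marks a word whose removal lowers the probability of the generated word, that is, a word the prediction appears to depend on. The replacement variant mentioned in the abstract could follow the same pattern, substituting a similar word at position `i` instead of deleting it.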
